SARIMA (Seasonal ARIMA): seasonality handling, differencing, and forecasting#
SARIMA extends ARIMA to explicitly model seasonal patterns (weekly, monthly, quarterly, …) by adding a second ARIMA component that operates at the seasonal lag s.
This notebook focuses on:
Precise seasonality handling in SARIMA
Seasonal vs non-seasonal components
When SARIMA is required (vs plain ARIMA)
Seasonal differencing and the tuple (p,d,q)(P,D,Q,s)
Intuition via decomposition-style plots and multi-season forecasts
A low-level NumPy implementation of the mechanics (differencing + lag polynomials + recursive forecasting)
Prerequisites#
AR / MA / ARMA basics
Stationarity and differencing
Lag operator notation (optional but helpful)
1) What does “seasonality” mean, and how does SARIMA handle it?#
A time series has seasonality when its behavior repeats with a fixed period s:
Deterministic seasonality: a stable repeating pattern in the mean (e.g., “December is always higher”).
Stochastic seasonality: seasonal effects are correlated across cycles (e.g., “this December depends on last December”).
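The distinction can be made concrete with a small standalone simulation (not part of the dataset built below): under deterministic seasonality, each "December" is an independent draw around a fixed level, while under stochastic seasonality this December follows last December.

```python
import numpy as np

rng = np.random.default_rng(0)
s, n = 12, 600
t = np.arange(n)
eps = rng.normal(0.0, 1.0, size=n)

# Deterministic seasonality: the same fixed pattern every cycle, plus independent noise.
det = 5.0 * np.sin(2 * np.pi * t / s) + eps

# Stochastic seasonality: this season's value depends on the same season one cycle ago.
sto = np.zeros(n)
for i in range(n):
    sto[i] = (0.9 * sto[i - s] if i >= s else 0.0) + eps[i]

def year_corr(x: np.ndarray) -> float:
    vals = x[s - 1 :: s]  # one observation per cycle (every "December")
    return float(np.corrcoef(vals[:-1], vals[1:])[0, 1])

print(round(year_corr(det), 2), round(year_corr(sto), 2))  # near zero vs strongly positive
```

Correlating the same month across consecutive years is a quick diagnostic: only the stochastic series shows year-over-year dependence.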
SARIMA is designed to handle stochastic seasonality by combining:
Seasonal differencing D (at lag s): removes seasonal unit roots / repeating seasonal levels.
Seasonal AR terms P: correlation at lags s, 2s, …, Ps.
Seasonal MA terms Q: how seasonal shocks propagate to lags s, 2s, …, Qs.
If seasonality is purely deterministic, alternatives can be simpler:
seasonal dummies (month-of-year indicators)
Fourier terms (smooth seasonality)
decomposition + ARIMA on the remainder
But when residuals still show seasonal autocorrelation, SARIMA is often the right tool.
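As a sketch of the deterministic route (a standalone toy example, not the notebook's dataset): Fourier terms at multiples of the seasonal frequency can be fitted by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
s, n, K = 12, 240, 2  # season length, sample size, number of harmonics
t = np.arange(n)
y = 50.0 + 8.0 * np.sin(2 * np.pi * t / s) + rng.normal(0.0, 1.0, size=n)

# Design matrix: intercept plus K sine/cosine pairs at multiples of the seasonal frequency.
cols = [np.ones(n)]
for k in range(1, K + 1):
    cols.append(np.sin(2 * np.pi * k * t / s))
    cols.append(np.cos(2 * np.pi * k * t / s))
X = np.column_stack(cols)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
print(round(float(beta[1]), 1))  # recovers the sine amplitude, close to 8
```

If the residuals of such a regression still show autocorrelation at lags s, 2s, …, the seasonality is not purely deterministic and SARIMA becomes attractive.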
2) Seasonal vs non-seasonal components (and the full SARIMA equation)#
Let L be the lag operator: L^k y_t = y_{t-k}.
Non-seasonal differencing (order d): (1 - L)^d y_t
Seasonal differencing (order D, season length s): (1 - L^s)^D y_t
Define the stationary transformed series:
w_t = (1 - L)^d (1 - L^s)^D y_t
A SARIMA(p,d,q)(P,D,Q,s) model is:
\phi_p(L) \Phi_P(L^s) w_t = c + \theta_q(L) \Theta_Q(L^s) \varepsilon_t
with (one common sign convention):
\phi_p(L) = 1 - \phi_1 L - … - \phi_p L^p,   \Phi_P(L^s) = 1 - \Phi_1 L^s - … - \Phi_P L^{Ps}
\theta_q(L) = 1 + \theta_1 L + … + \theta_q L^q,   \Theta_Q(L^s) = 1 + \Theta_1 L^s + … + \Theta_Q L^{Qs}
Interpretation of the tuple:
(p,d,q) controls short-lag (non-seasonal) dynamics and differencing.
(P,D,Q,s) controls seasonal-lag dynamics and seasonal differencing.
s is the number of observations per season (e.g., 12 for monthly data with yearly seasonality).
Key intuition: SARIMA is multiplicative. The combined AR lag polynomial is the product \phi_p(L)\Phi_P(L^s), which introduces cross-lags like L^{s+1}.
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from plotly.subplots import make_subplots
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.statespace.sarimax import SARIMAX
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
rng = np.random.default_rng(0)
# --- Synthetic monthly data with clear seasonality ---
# We purposefully include:
# - deterministic seasonality (sin/cos)
# - stochastic seasonal dependence (AR at lag s)
# - a mild trend (to motivate non-seasonal differencing)
season_length = 12
n_years = 10
n_obs = n_years * season_length
t = np.arange(n_obs)
dates = pd.date_range("2010-01-01", periods=n_obs, freq="MS")
det_seasonal = 8.0 * np.sin(2 * np.pi * t / season_length) + 3.0 * np.cos(2 * np.pi * t / season_length)
trend = 0.10 * t
seasonal_noise = np.zeros(n_obs)
short_noise = np.zeros(n_obs)
eps_seasonal = rng.normal(0.0, 0.8, size=n_obs)
eps_short = rng.normal(0.0, 1.0, size=n_obs)
for i in range(n_obs):
    short_noise[i] = (0.6 * short_noise[i - 1] if i > 0 else 0.0) + eps_short[i]
    seasonal_noise[i] = (0.7 * seasonal_noise[i - season_length] if i >= season_length else 0.0) + eps_seasonal[i]
y = 50.0 + trend + det_seasonal + seasonal_noise + short_noise
series = pd.Series(y, index=dates, name="y")
series.head(), series.tail()
(2010-01-01 53.888172
2010-02-01 57.909024
2010-03-01 60.006114
2010-04-01 57.476490
2010-05-01 54.720164
Freq: MS, Name: y, dtype: float64,
2019-08-01 53.911005
2019-09-01 52.239645
2019-10-01 50.273459
2019-11-01 51.970637
2019-12-01 58.234690
Freq: MS, Name: y, dtype: float64)
fig = go.Figure()
fig.add_trace(go.Scatter(x=series.index, y=series.values, mode="lines", name="y"))
fig.update_layout(
    title="Synthetic monthly series with seasonality",
    xaxis_title="Date",
    yaxis_title="Value",
    template="plotly_white",
    height=420,
)
fig.show()
3) Seasonal effects (visual intuition)#
Two complementary views are useful:
Within-season distribution: how does the typical value differ by month-of-year?
Seasonal decomposition intuition: treat the series as trend + seasonal + residual (an intuitive lens, not the SARIMA model itself).
Additive decomposition intuition: y_t = T_t + S_t + R_t (trend + seasonal + residual).
df = series.to_frame()
df["month_num"] = df.index.month
df["month"] = df["month_num"].map(lambda m: pd.Timestamp(2000, m, 1).strftime("%b"))
month_order = [pd.Timestamp(2000, m, 1).strftime("%b") for m in range(1, 13)]
fig = px.box(
    df,
    x="month",
    y="y",
    category_orders={"month": month_order},
    points="all",
    title="Seasonal effect view: distribution by month-of-year",
)
fig.update_layout(template="plotly_white", height=420)
fig.show()
decomp = seasonal_decompose(series, model="additive", period=season_length)
fig = make_subplots(
    rows=4,
    cols=1,
    shared_xaxes=True,
    vertical_spacing=0.03,
    subplot_titles=("Observed", "Trend (smoothed)", "Seasonal component", "Residual"),
)
fig.add_trace(go.Scatter(x=series.index, y=decomp.observed, mode="lines", name="Observed"), row=1, col=1)
fig.add_trace(go.Scatter(x=series.index, y=decomp.trend, mode="lines", name="Trend"), row=2, col=1)
fig.add_trace(go.Scatter(x=series.index, y=decomp.seasonal, mode="lines", name="Seasonal"), row=3, col=1)
fig.add_trace(go.Scatter(x=series.index, y=decomp.resid, mode="lines", name="Residual"), row=4, col=1)
fig.update_layout(
    title="Seasonal decomposition intuition (additive)",
    template="plotly_white",
    height=900,
    showlegend=False,
)
fig.show()
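To demystify `seasonal_decompose`, here is a minimal standalone approximation of the classical additive algorithm (a 2 x 12 centered moving average for the trend, month-of-cycle means for the seasonal part); it is a sketch of the idea, not statsmodels' exact code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
s, n = 12, 120
t = np.arange(n)
y = pd.Series(50.0 + 0.1 * t + 8.0 * np.sin(2 * np.pi * t / s) + rng.normal(0.0, 1.0, size=n))

# Trend: 2 x s centered moving average (the standard choice for an even period).
trend = y.rolling(window=s, center=True).mean().rolling(window=2, center=True).mean()

# Seasonal: average the detrended values by position within the cycle, then center them.
detrended = y - trend
pos = np.arange(n) % s
means = detrended.groupby(pos).mean()
seasonal = pd.Series((means - means.mean()).to_numpy()[pos])

resid = y - trend - seasonal
print(round(float(resid.dropna().std()), 2))  # roughly the noise scale
```

The moving average of length s wipes out any pattern with period s, so what is left after detrending and month-averaging is the seasonal component plus noise.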
4) Seasonal differencing (and why it matters)#
Seasonal differencing compares each observation to the one a full season earlier: (1 - L^s) y_t = y_t - y_{t-s}.
For monthly data with yearly seasonality (s=12), this is literally “this month minus the same month last year”.
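For instance (a standalone check, assuming s=12): pandas' `Series.diff(12)` computes exactly this, and on a series with a fixed seasonal pattern plus a linear trend it cancels the pattern and turns the trend into a constant.

```python
import numpy as np
import pandas as pd

t = np.arange(36)
idx = pd.date_range("2020-01-01", periods=36, freq="MS")
y = pd.Series(0.5 * t + 10.0 * np.sin(2 * np.pi * t / 12), index=idx)

sd = y.diff(12).dropna()                      # y_t - y_{t-12}
manual = y.to_numpy()[12:] - y.to_numpy()[:-12]

print(np.allclose(sd.to_numpy(), manual))     # → True
print(np.allclose(sd.to_numpy(), 0.5 * 12))   # → True: pattern cancels, trend becomes a constant
```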
A common practical pattern:
d=1 removes a slow trend / random walk-like behavior.
D=1 removes a seasonal unit root (seasonal random walk-like behavior).
Example (combined differencing with d=1, D=1):
def difference(x: np.ndarray, *, lag: int = 1, order: int = 1) -> np.ndarray:
    """Return the `order`-times differenced series with step `lag`.

    Examples:
    - lag=1, order=1: x_t - x_{t-1}
    - lag=s, order=1: x_t - x_{t-s}
    """
    x = np.asarray(x, dtype=float)
    out = x
    for _ in range(order):
        out = out[lag:] - out[:-lag]
    return out
y = series.values
d1 = difference(y, lag=1, order=1)
Ds1 = difference(y, lag=season_length, order=1)
d1_Ds1 = difference(difference(y, lag=1, order=1), lag=season_length, order=1)
idx_d1 = series.index[1:]
idx_Ds1 = series.index[season_length:]
idx_d1_Ds1 = series.index[season_length + 1 :]
fig = make_subplots(
    rows=4,
    cols=1,
    shared_xaxes=False,
    vertical_spacing=0.04,
    subplot_titles=(
        "Original series",
        "Non-seasonal difference (d=1)",
        f"Seasonal difference (D=1, s={season_length})",
        f"Combined (d=1, D=1, s={season_length})",
    ),
)
fig.add_trace(go.Scatter(x=series.index, y=y, mode="lines", name="y"), row=1, col=1)
fig.add_trace(go.Scatter(x=idx_d1, y=d1, mode="lines", name="d=1"), row=2, col=1)
fig.add_trace(go.Scatter(x=idx_Ds1, y=Ds1, mode="lines", name="D=1"), row=3, col=1)
fig.add_trace(go.Scatter(x=idx_d1_Ds1, y=d1_Ds1, mode="lines", name="d=1,D=1"), row=4, col=1)
fig.update_layout(
    title="Differencing removes trend and/or seasonality",
    template="plotly_white",
    height=900,
    showlegend=False,
)
fig.show()
5) When is SARIMA required?#
You typically need SARIMA (or another seasonal model) when:
The series has a stable period s and forecasts must respect that cycle.
After handling trend (d), the residuals still show seasonal autocorrelation at lags s, 2s, ….
A plain ARIMA forecast systematically misses recurring peaks/troughs.
Seasonal differencing (D=1) is needed to make the series look stationary.
If seasonality is deterministic and well-explained by calendar features, SARIMAX (with regressors) or regression + ARMA errors can be a better fit. If seasonality is stochastic (depends on past seasons), SARIMA is often more appropriate.
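As a sketch of the regression route (standalone, with made-up month effects): one-hot month dummies fitted by least squares absorb a purely deterministic seasonal pattern, leaving near-white residuals with no need for seasonal differencing.

```python
import numpy as np

rng = np.random.default_rng(0)
s, n = 12, 240
month = np.arange(n) % s
effects = rng.normal(0.0, 5.0, size=s)        # fixed (deterministic) month effects
y = 50.0 + effects[month] + rng.normal(0.0, 1.0, size=n)

X = np.zeros((n, s))
X[np.arange(n), month] = 1.0                  # one-hot month dummies (intercept absorbed)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

print(round(float(np.var(resid)), 2))  # close to the noise variance of 1
```

If the same check on stochastic seasonal data left seasonal autocorrelation in `resid`, that would be the signal to reach for SARIMA instead.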
6) The parameter tuple (p,d,q)(P,D,Q,s)#
p: non-seasonal AR order (lags 1..p)
d: number of non-seasonal differences
q: non-seasonal MA order (lags 1..q)
P: seasonal AR order (lags s, 2s, …, Ps)
D: number of seasonal differences (at lag s)
Q: seasonal MA order (lags s, 2s, …, Qs)
s: season length
Multiplicative structure means you can get cross-lags. Example:
For SARIMA(1,0,0)(1,0,0,s), the AR polynomial expands as:
(1 - \phi_1 L)(1 - \Phi_1 L^s) = 1 - \phi_1 L - \Phi_1 L^s + \phi_1 \Phi_1 L^{s+1}
So the effective AR part involves lags 1, s, and also s+1.
phi_1 = 0.4
Phi_1 = 0.6
s = season_length
ar_ns = np.array([1.0, -phi_1])
ar_s = np.zeros(s + 1)
ar_s[0] = 1.0
ar_s[s] = -Phi_1
combined = np.convolve(ar_ns, ar_s)
nonzero_lags = [(lag, coef) for lag, coef in enumerate(combined) if abs(coef) > 1e-12]
nonzero_lags
[(0, 1.0), (1, -0.4), (12, -0.6), (13, 0.24)]
7) Low-level NumPy implementation (mechanics)#
Below is a deliberately low-level implementation of SARIMA mechanics:
Build the combined AR and MA lag polynomials via convolution.
Compute one-step-ahead predictions via an innovations recursion (conditional residuals).
Produce multi-step forecasts by setting future innovations to zero (mean forecast).
Undo differencing to return forecasts to the original scale.
This is meant for learning (how the model works), not as a drop-in replacement for a production-grade state-space implementation.
def _ar_poly(phi: np.ndarray, Phi: np.ndarray, s: int) -> np.ndarray:
    """Return AR lag polynomial coefficients for phi(L)*Phi(L^s).

    Convention: phi(L) = 1 - phi1 L - ... and Phi(L^s) = 1 - Phi1 L^s - ...
    """
    phi = np.asarray(phi, dtype=float)
    Phi = np.asarray(Phi, dtype=float)
    ar_ns = np.empty(len(phi) + 1)
    ar_ns[0] = 1.0
    ar_ns[1:] = -phi
    ar_s = np.zeros(len(Phi) * s + 1)
    ar_s[0] = 1.0
    for j, Phi_j in enumerate(Phi, start=1):
        ar_s[j * s] = -Phi_j
    return np.convolve(ar_ns, ar_s)

def _ma_poly(theta: np.ndarray, Theta: np.ndarray, s: int) -> np.ndarray:
    """Return MA lag polynomial coefficients for theta(L)*Theta(L^s).

    Convention: theta(L) = 1 + theta1 L + ... and Theta(L^s) = 1 + Theta1 L^s + ...
    """
    theta = np.asarray(theta, dtype=float)
    Theta = np.asarray(Theta, dtype=float)
    ma_ns = np.empty(len(theta) + 1)
    ma_ns[0] = 1.0
    ma_ns[1:] = theta
    ma_s = np.zeros(len(Theta) * s + 1)
    ma_s[0] = 1.0
    for j, Theta_j in enumerate(Theta, start=1):
        ma_s[j * s] = Theta_j
    return np.convolve(ma_ns, ma_s)
def difference_with_forecast_history(y: np.ndarray, *, d: int, D: int, s: int) -> tuple[np.ndarray, list[float], list[np.ndarray]]:
    """Apply differences to y and store the end-of-sample values needed to invert them for forecasting."""
    x = np.asarray(y, dtype=float)
    last_values_nonseasonal: list[float] = []
    for _ in range(d):
        last_values_nonseasonal.append(float(x[-1]))
        x = x[1:] - x[:-1]
    last_values_seasonal: list[np.ndarray] = []
    for _ in range(D):
        last_values_seasonal.append(x[-s:].copy())
        x = x[s:] - x[:-s]
    return x, last_values_nonseasonal, last_values_seasonal
def invert_forecast_differences(
    x_future: np.ndarray,
    *,
    d: int,
    D: int,
    s: int,
    last_values_nonseasonal: list[float],
    last_values_seasonal: list[np.ndarray],
) -> np.ndarray:
    """Invert differencing for future points using stored end-of-sample values."""
    x = np.asarray(x_future, dtype=float)
    # Undo seasonal differencing in reverse order.
    for j in range(D - 1, -1, -1):
        history = list(np.asarray(last_values_seasonal[j], dtype=float))  # length s
        out = np.empty_like(x)
        for i, val in enumerate(x):
            next_val = val + history[-s]
            history.append(next_val)
            out[i] = next_val
        x = out
    # Undo non-seasonal differencing in reverse order.
    for j in range(d - 1, -1, -1):
        running = float(last_values_nonseasonal[j])
        out = np.empty_like(x)
        for i, val in enumerate(x):
            running = running + val
            out[i] = running
        x = out
    return x
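A quick standalone sanity check of the invert-differencing idea (simplified helpers, single seasonal difference only): seasonally differencing "future" values and then inverting with the stored end-of-sample history must reproduce them exactly.

```python
import numpy as np

def diff_once(x: np.ndarray, lag: int) -> np.ndarray:
    return x[lag:] - x[:-lag]

def invert_once(x_future: np.ndarray, history: np.ndarray, lag: int) -> np.ndarray:
    # history: the last `lag` in-sample values on the undifferenced scale.
    hist = list(history)
    out = []
    for v in x_future:
        nxt = v + hist[-lag]
        hist.append(nxt)
        out.append(nxt)
    return np.array(out)

rng = np.random.default_rng(1)
s = 12
y = np.cumsum(rng.normal(size=60))
train, future = y[:48], y[48:]

# Difference the "future" block (it needs the last s training values), then invert.
w_future = diff_once(np.concatenate([train[-s:], future]), s)
restored = invert_once(w_future, train[-s:], s)
print(np.allclose(restored, future))  # → True
```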
def arma_innovations(w: np.ndarray, *, ar_poly: np.ndarray, ma_poly: np.ndarray, constant: float = 0.0) -> tuple[np.ndarray, np.ndarray]:
    """Compute one-step-ahead predictions and innovations (conditional residuals).

    Model form in lag polynomials:
        ar_poly(L) w_t = constant + ma_poly(L) eps_t
    with ar_poly[0]=1 and ma_poly[0]=1.
    """
    w = np.asarray(w, dtype=float)
    K = len(ar_poly) - 1
    M = len(ma_poly) - 1
    w_hat = np.zeros_like(w)
    eps = np.zeros_like(w)
    for t in range(len(w)):
        ar_term = 0.0
        for k in range(1, K + 1):
            if t - k >= 0:
                ar_term += (-ar_poly[k]) * w[t - k]
        ma_term = 0.0
        for k in range(1, M + 1):
            if t - k >= 0:
                ma_term += ma_poly[k] * eps[t - k]
        w_hat[t] = constant + ar_term + ma_term
        eps[t] = w[t] - w_hat[t]
    return w_hat, eps
def arma_forecast(
    w: np.ndarray,
    eps: np.ndarray,
    *,
    ar_poly: np.ndarray,
    ma_poly: np.ndarray,
    steps: int,
    constant: float = 0.0,
) -> np.ndarray:
    """Multi-step mean forecast: future eps are set to 0."""
    K = len(ar_poly) - 1
    M = len(ma_poly) - 1
    w_ext = list(np.asarray(w, dtype=float))
    eps_ext = list(np.asarray(eps, dtype=float))
    out = np.zeros(steps, dtype=float)
    for i in range(steps):
        t = len(w_ext)
        ar_term = 0.0
        for k in range(1, K + 1):
            ar_term += (-ar_poly[k]) * w_ext[t - k]
        ma_term = 0.0
        for k in range(1, M + 1):
            ma_term += ma_poly[k] * eps_ext[t - k]
        w_next = constant + ar_term + ma_term
        out[i] = w_next
        w_ext.append(w_next)
        eps_ext.append(0.0)
    return out
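One useful property to check the forecast recursion against (a standalone sketch for a pure AR(1) with zero constant): mean forecasts decay geometrically, \hat w_{t+h} = \phi^h w_t.

```python
import numpy as np

phi = 0.5
w_last = 2.0
ar_poly = np.array([1.0, -phi])   # same sign convention as the lag polynomials above

# The same recursion as the multi-step forecast, specialized to AR(1) with no MA part.
w_ext = [w_last]
forecasts = []
for _ in range(5):
    nxt = -ar_poly[1] * w_ext[-1]
    forecasts.append(nxt)
    w_ext.append(nxt)

expected = [phi**h * w_last for h in range(1, 6)]
print(np.allclose(forecasts, expected))  # → True
```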
# --- Practical usage: fit SARIMA with statsmodels, then replicate mean forecasts with NumPy ---
train = series.iloc[:-24]
test = series.iloc[-24:]
# A common seasonal choice for monthly data.
order = (1, 1, 1)
seasonal_order = (1, 1, 1, season_length)
model = SARIMAX(
    train,
    order=order,
    seasonal_order=seasonal_order,
    trend="c",  # allow drift (helps when d>0 and the series has trend)
    enforce_stationarity=False,
    enforce_invertibility=False,
)
res = model.fit(disp=False)
steps = len(test)
fc_sm = res.get_forecast(steps=steps)
fc_sm_mean = fc_sm.predicted_mean
fc_sm_ci = fc_sm.conf_int()
# Extract parameters needed for our low-level recursion.
p, d, q = order
P, D, Q, s = seasonal_order
params = res.params
phi = np.array([params.get(f"ar.L{i}", 0.0) for i in range(1, p + 1)], dtype=float)
theta = np.array([params.get(f"ma.L{i}", 0.0) for i in range(1, q + 1)], dtype=float)
Phi = np.array([params.get(f"ar.S.L{(i * s)}", 0.0) for i in range(1, P + 1)], dtype=float)
Theta = np.array([params.get(f"ma.S.L{(i * s)}", 0.0) for i in range(1, Q + 1)], dtype=float)
constant = float(params.get("intercept", params.get("const", 0.0)))
ar_poly = _ar_poly(phi, Phi, s)
ma_poly = _ma_poly(theta, Theta, s)
# Difference training data ourselves and keep the history needed to invert.
w_train, last_non, last_seas = difference_with_forecast_history(train.values, d=d, D=D, s=s)
# Conditional innovations recursion on the differenced series.
w_hat, eps = arma_innovations(w_train, ar_poly=ar_poly, ma_poly=ma_poly, constant=constant)
# Mean forecast in the differenced space.
w_future = arma_forecast(w_train, eps, ar_poly=ar_poly, ma_poly=ma_poly, steps=steps, constant=constant)
# Undo differencing to get forecasts on the original scale.
y_future_numpy = invert_forecast_differences(
    w_future,
    d=d,
    D=D,
    s=s,
    last_values_nonseasonal=last_non,
    last_values_seasonal=last_seas,
)
freq = pd.infer_freq(train.index) or "MS"
forecast_index = pd.date_range(train.index[-1], periods=steps + 1, freq=freq)[1:]
fc_np = pd.Series(y_future_numpy, index=forecast_index, name="forecast_numpy")
res.summary().tables[0]
| Dep. Variable: | y | No. Observations: | 96 |
|---|---|---|---|
| Model: | SARIMAX(1, 1, 1)x(1, 1, 1, 12) | Log Likelihood | -116.042 |
| Date: | Sat, 31 Jan 2026 | AIC | 244.084 |
| Time: | 12:34:56 | BIC | 257.488 |
| Sample: | 01-01-2010 | HQIC | 249.402 |
| | - 12-01-2017 | | |
| Covariance Type: | opg | | |
fig = go.Figure()
fig.add_trace(go.Scatter(x=train.index, y=train.values, mode="lines", name="Train"))
fig.add_trace(go.Scatter(x=test.index, y=test.values, mode="lines", name="Test (truth)", line=dict(color="black")))
# Statsmodels forecast + CI band
fig.add_trace(go.Scatter(x=fc_sm_mean.index, y=fc_sm_mean.values, mode="lines", name="SARIMA forecast (statsmodels)"))
lower = fc_sm_ci.iloc[:, 0].values
upper = fc_sm_ci.iloc[:, 1].values
x_ci = np.concatenate([fc_sm_mean.index.values, fc_sm_mean.index.values[::-1]])
y_ci = np.concatenate([upper, lower[::-1]])
fig.add_trace(
    go.Scatter(
        x=x_ci,
        y=y_ci,
        fill="toself",
        fillcolor="rgba(0, 102, 204, 0.15)",
        line=dict(color="rgba(0,0,0,0)"),
        hoverinfo="skip",
        name="95% CI (statsmodels)",
    )
)
# NumPy recreation of the mean forecast (should be close, not necessarily identical)
fig.add_trace(
    go.Scatter(
        x=fc_np.index,
        y=fc_np.values,
        mode="lines",
        name="Mean forecast (NumPy mechanics)",
        line=dict(dash="dash"),
    )
)
fig.update_layout(
    title="Forecasts across seasons (multi-step ahead)",
    xaxis_title="Date",
    yaxis_title="Value",
    template="plotly_white",
    height=520,
)
fig.show()
8) Pitfalls + diagnostics#
Wrong season length s: if the true period is weekly but you set monthly, the seasonal terms will not help.
Over-differencing: too large d/D can introduce unnecessary MA behavior and inflate variance.
Deterministic vs stochastic seasonality: if seasonality is explained by calendar dummies, SARIMAX (with regressors) can be simpler.
Check residuals: after fitting, the residual ACF should not show spikes at s, 2s, ….
Transformations: multiplicative seasonality is often handled via log(y) then additive SARIMA.
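A standalone sketch of the residual check (a plain-NumPy sample ACF; in practice `statsmodels.graphics.tsaplots.plot_acf` is handier): residuals that still carry seasonal structure show a spike at lag s, while white-noise residuals do not.

```python
import numpy as np

def acf_at(x: np.ndarray, lag: int) -> float:
    # Sample autocorrelation at one lag (standard biased normalization).
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[lag:], x[:-lag]) / np.dot(x, x))

rng = np.random.default_rng(2)
s, n = 12, 240

# "Bad" residuals: leftover seasonal AR structure the fitted model failed to capture.
bad = np.zeros(n)
eps = rng.normal(size=n)
for t in range(n):
    bad[t] = (0.6 * bad[t - s] if t >= s else 0.0) + eps[t]

good = rng.normal(size=n)  # white-noise residuals

print(round(acf_at(bad, s), 2), round(acf_at(good, s), 2))  # clear spike vs near zero
```

A rule of thumb: seasonal-lag autocorrelations well outside roughly ±2/sqrt(n) suggest the seasonal part of the model is misspecified.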
9) Exercises#
Change season_length to 7 and regenerate a synthetic weekly pattern.
Try a simpler model, e.g. SARIMA(1,1,0)(1,1,0,12), and compare forecasts.
Remove the deterministic seasonal term and keep only stochastic seasonal dependence; does D=1 still help?
Add Fourier terms as regressors (SARIMAX) and compare to a seasonal AR/MA specification.
References#
Box, Jenkins, Reinsel, Ljung — Time Series Analysis: Forecasting and Control
Hyndman & Athanasopoulos — Forecasting: Principles and Practice (seasonal ARIMA chapter)
statsmodels SARIMAX documentation